C. ROBINSON TOMPKINS

Abstract. We define a morphism based upon a Latin square that generalizes the Thue-Morse morphism.
We prove that fixed points of this morphism are overlap-free sequences, generalizing results of
Allouche-Shallit and Frid.



1. Introduction 

In his 1912 paper, Axel Thue introduced the first binary sequence that does not contain an overlap [7]. 
It is now called the Thue-Morse sequence: 

01101001100101101001011001101001 .... 

An overlap is a string of letters of the form $cxcxc$, where $c$ is a single letter and $x$ is a finite string that is
potentially empty. Overlaps begin with a square, namely $ww$ where $w = cx$ as given above. It is easy to
observe, as Thue did, that any binary string of four or more letters must contain a square.
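
The two definitions above can be checked by brute force on short strings. The following is a minimal sketch; the function names `has_square` and `has_overlap` are our own and do not come from the paper. It confirms that every binary string of length four contains a square, while overlaps can still be avoided.

```python
from itertools import product


def has_square(s):
    """Return True if s contains a nonempty factor of the form ww."""
    size = len(s)
    for start in range(size):
        for half in range(1, (size - start) // 2 + 1):
            if s[start:start + half] == s[start + half:start + 2 * half]:
                return True
    return False


def has_overlap(s):
    """Return True if s contains a factor cxcxc (c a letter, x possibly empty)."""
    size = len(s)
    for start in range(size):
        # p = |cx| is the period; the overlap occupies 2p + 1 letters.
        for p in range(1, (size - start - 1) // 2 + 1):
            if all(s[start + k] == s[start + k + p] for k in range(p + 1)):
                return True
    return False


# Binary strings of length 3 that avoid squares, and the length-4 check.
square_free_len3 = [w for w in map("".join, product("01", repeat=3)) if not has_square(w)]
all_len4_have_square = all(has_square("".join(w)) for w in product("01", repeat=4))
print(square_free_len3, all_len4_have_square)
```

The overlap test relies on the observation that a string of length $2p + 1$ has the form $cxcxc$ with $|cx| = p$ exactly when it has period $p$.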

There are several ways to define the Thue-Morse sequence [1]. We will derive it as a fixed point of a
morphism. Let $\Sigma$ be an alphabet and let $\Sigma^* \cup \Sigma^\omega$ be the set of all finite or infinite strings over $\Sigma$. A
morphism is a mapping

$$h : \Sigma^* \cup \Sigma^\omega \to \Sigma^* \cup \Sigma^\omega$$

that obeys the identity $h(xy) = h(x)h(y)$, for $x$ a finite string and $y \in \Sigma^* \cup \Sigma^\omega$ [1, p. 8].
By [1, p. 16], define the Thue-Morse morphism on $\Sigma = \{0, 1\}$ as

(1)   $\mu(t) = \begin{cases} 01, & \text{for } t = 0 \\ 10, & \text{for } t = 1. \end{cases}$



The sequence obtained by applying the $n$th iterate of $\mu$ to $0$ converges, as $n$ grows, to the Thue-Morse sequence, denoted
$\mu^\omega(0)$, which of course is infinite. In particular,

$\mu(0) = 01$
$\mu^2(0) = \mu(\mu(0)) = \mu(01) = \mu(0)\mu(1) = 0110$
$\mu^3(0) = \mu(\mu^2(0)) = \mu(0110) = 01101001$
$\vdots$
$\mu^\omega(0) = 01101001100101101001011001101001\ldots$

Notice that $\mu(\mu^n(0)) = \mu^{n+1}(0)$ and $\mu(\mu^\omega(0)) = \mu^\omega(0)$. This second observation says that the Thue-Morse
sequence is a fixed point of $\mu$ [1, p. 10].
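
The iteration can be reproduced in a few lines. This is a minimal sketch in which the dictionary `MU` and the helpers `apply_morphism` and `iterate` are illustrative names of our own; it shows that each iterate is a prefix of the next, which is why the iterates converge to an infinite fixed point.

```python
# The Thue-Morse morphism: 0 -> 01, 1 -> 10.
MU = {"0": "01", "1": "10"}


def apply_morphism(rules, word):
    """Apply a morphism letter by letter and concatenate the images."""
    return "".join(rules[letter] for letter in word)


def iterate(rules, seed, k):
    """Return the k-th iterate of the morphism applied to seed."""
    word = seed
    for _ in range(k):
        word = apply_morphism(rules, word)
    return word


for k in range(1, 6):
    print(k, iterate(MU, "0", k))
```

Since $\mu(0)$ begins with $0$, applying $\mu$ to a prefix of the limit again yields a prefix of the limit.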

We can identify the binary alphabet of the Thue-Morse sequence with $\mathbb{Z}/2\mathbb{Z}$, the integers modulo 2. It
is then natural to generalize it to $\mathbb{Z}/n\mathbb{Z}$ by considering the alphabet $\Sigma = \{0, 1, \ldots, n-1\}$ and, for $i \in \Sigma$,
defining the morphism

$$\varphi_n(i) = \overline{i+0}\;\overline{i+1}\;\cdots\;\overline{i+(n-1)},$$

where $\overline{i}$ is the residue of $i$ modulo $n$. Notice that for $\Sigma = \{0,1\}$, $\varphi_2(i) = \mu(i)$. In 2000, Allouche and Shallit
proved that $\varphi_n^\omega(0)$ is overlap-free [3].
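
As an illustration of the generalized morphism, the sketch below builds the image of a letter as a list of residues and iterates it. The function names `phi`, `apply_phi`, and `fixed_point_prefix` are hypothetical and chosen only for this example.

```python
def phi(n, i):
    """Image of the letter i: the residues of i, i+1, ..., i+(n-1) modulo n."""
    return [(i + k) % n for k in range(n)]


def apply_phi(n, word):
    """Apply the morphism to every letter of word and concatenate."""
    out = []
    for letter in word:
        out.extend(phi(n, letter))
    return out


def fixed_point_prefix(n, seed, iterations):
    """Prefix of the fixed point obtained by iterating from a single letter."""
    word = [seed]
    for _ in range(iterations):
        word = apply_phi(n, word)
    return word


print(phi(2, 0), phi(2, 1))          # agrees with the Thue-Morse morphism
print(fixed_point_prefix(3, 0, 2))   # 9 letters over {0, 1, 2}
```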

In this paper, we generalize $\varphi_n$, which is based on the Cayley table of $\mathbb{Z}/n\mathbb{Z}$, to Latin squares of arbitrary
finite size $n$. We define our morphism based on the Latin square, and prove that the fixed point of the Latin
square morphism is an overlap-free sequence. Note that the Cayley table for $\mathbb{Z}/n\mathbb{Z}$ is a Latin square, but
not every Latin square is a Cayley table.

2. Latin Square Morphisms produce Tilings

Allouche and Shallit's morphism can be seen as a mapping of $i$ to the row of the Cayley table for $\mathbb{Z}/n\mathbb{Z}$
that begins with $i$. For example, when $n = 3$, the morphism $\varphi_3$ is given by

$0 \to 012$
$1 \to 120$
$2 \to 201$

This suggests a natural generalization to any Latin square. 

Begin with a generic alphabet of $n$ letters, which we may assume to be $\{1, 2, \ldots, n\}$. Recall that a Latin
square $C$ is an $n \times n$ table with $n$ different letters such that each letter occurs only once in each column and
only once in each row. We will concern ourselves with the Latin squares in which the first column retains the
natural order of our alphabet $(1, 2, \ldots, n)$. For $n = 3$, there are two such Latin squares. The one that does
not come from $\mathbb{Z}/3\mathbb{Z}$ directly is

$$\begin{bmatrix} 1 & 3 & 2 \\ 2 & 1 & 3 \\ 3 & 2 & 1 \end{bmatrix}.$$

Let $C_t$ denote the $t$th row of our Latin square $C$. For each $t \in \Sigma$ we define the Latin square morphism by
$\ell(t) = C_t$. For example, we can use the above Latin square for $n = 3$ to define the following morphism,

$$\ell(t) = \begin{cases} 132, & \text{for } t = 1 \\ 213, & \text{for } t = 2 \\ 321, & \text{for } t = 3. \end{cases}$$

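Given any $t \in \Sigma$, the sequence $\ell(t), \ell^2(t), \ell^3(t), \ldots$ converges to a sequence $\ell^\omega(t)$, which is a fixed point of the morphism
$\ell$. So,

(2)   $\ell(\ell^\omega(t)) = \ell^\omega(t)$.

In fact every fixed point of $\ell$ is of the form $\ell^\omega(t)$ for some $t \in \Sigma$ [1, p. 10].
Express the sequence as $\ell^\omega(t_1) = t_1 t_2 t_3 \ldots$, so

$$\ell^\omega(t_1) = \ell(\ell^\omega(t_1)) = \ell(t_1 t_2 t_3 \ldots) = \ell(t_1)\ell(t_2)\ell(t_3)\cdots$$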

Thus, we have a tiling of our sequence (and of the natural numbers) by the rows of our Latin square C. 
Again, in terms of our example where n = 3 we have three tiles 132, 213, and 321 and so 

$$\ell^\omega(1) = 132321213321213132\ldots = |132|321|213|321|213|132|\ldots.$$

Now, consider the subsequence created by taking the first letter of each tile. Notice that this sequence is in 
fact our original sequence. Thus our sequence contains itself as a subsequence. These two observations, our 
sequence as a tiling and our sequence equaling a subsequence of itself, will be critical for the proof of our 
main result. 
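
Both observations can be checked on a finite prefix. The sketch below is an illustration under our own naming (`SQUARE`, `make_morphism`, and `prefix` are not from the paper): it builds a prefix of the fixed point for the $n = 3$ example, cuts it into tiles of length $n$, and compares the first letters of the tiles with the sequence itself.

```python
# The n = 3 Latin square of the running example; row t starts with letter t.
SQUARE = [[1, 3, 2],
          [2, 1, 3],
          [3, 2, 1]]


def make_morphism(square):
    """Map each letter t to the row of the square that begins with t."""
    return {row[0]: tuple(row) for row in square}


def prefix(square, seed, iterations):
    """Prefix of the fixed point, obtained by iterating from a single letter."""
    rules = make_morphism(square)
    word = [seed]
    for _ in range(iterations):
        word = [letter for t in word for letter in rules[t]]
    return word


n = len(SQUARE)
word = prefix(SQUARE, 1, 4)                                   # 3**4 = 81 letters
tiles = [tuple(word[k:k + n]) for k in range(0, len(word), n)]
first_letters = [tile[0] for tile in tiles]

print("".join(map(str, word[:18])))
print(all(tile in make_morphism(SQUARE).values() for tile in tiles))
print(first_letters == word[:len(tiles)])
```

The last comparison holds because the tile in position $k$ is the image of $t_k$, and every row begins with the letter that indexes it.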

3. Overlap-Free Latin Square Sequences 
In this section we prove our main result. 

Theorem 3.1. Let $\Sigma = \{1, 2, \ldots, n\}$, and let $C$ be an $n \times n$ Latin square using the letters from $\Sigma$, with the
first column in its natural order. For an arbitrary $t \in \Sigma$, let $C_t$ denote the row of $C$ corresponding to $t$ in
the first column. If we define the Latin square morphism as

$$\ell(t) = C_t,$$

then we have that for any $t \in \Sigma$, $\ell^\omega(t)$ is an overlap-free sequence.

Remark. The Latin square for $n = 3$ above can be seen to be the Cayley table for $\mathbb{Z}/3\mathbb{Z}$ with the last two
columns transposed. Frid has shown that all morphisms based upon such Latin squares for $\mathbb{Z}/n\mathbb{Z}$ produce
overlap-free sequences as their fixed points [6]. Of course not every Latin square comes from a group Cayley
table. For an example of a Latin square that is not a group Cayley table see below [H p. 27].

1 2 3 4 5 6 

2 1 6 3 4 5 

3 4 5 2 6 1 

4 5 1 6 2 3 

5 6 4 1 3 2 

6 3 2 5 1 4 
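
As a sanity check on this example, the sketch below verifies the Latin square property and searches a finite prefix of the fixed point for overlaps. It is only an illustration: the names `SQUARE6`, `is_latin_square`, `prefix`, and `has_overlap` are ours, a finite search is not a proof, and the code does not test whether the square is a group Cayley table.

```python
SQUARE6 = [
    [1, 2, 3, 4, 5, 6],
    [2, 1, 6, 3, 4, 5],
    [3, 4, 5, 2, 6, 1],
    [4, 5, 1, 6, 2, 3],
    [5, 6, 4, 1, 3, 2],
    [6, 3, 2, 5, 1, 4],
]


def is_latin_square(square):
    """Each letter 1..n occurs exactly once in every row and every column."""
    n = len(square)
    letters = set(range(1, n + 1))
    rows_ok = all(len(row) == n and set(row) == letters for row in square)
    cols_ok = all({row[k] for row in square} == letters for k in range(n))
    return rows_ok and cols_ok


def prefix(square, seed, iterations):
    """Prefix of the fixed point, obtained by iterating from a single letter."""
    rules = {row[0]: row for row in square}
    word = [seed]
    for _ in range(iterations):
        word = [letter for t in word for letter in rules[t]]
    return word


def has_overlap(word):
    """Return True if word contains a factor cxcxc (period p = |cx|)."""
    size = len(word)
    for start in range(size):
        for p in range(1, (size - start - 1) // 2 + 1):
            if all(word[start + k] == word[start + k + p] for k in range(p + 1)):
                return True
    return False


word = prefix(SQUARE6, 1, 3)   # 6**3 = 216 letters
print(is_latin_square(SQUARE6), len(word), has_overlap(word))
```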

Proof. Let $\ell^\omega(t_1) = t_1 t_2 t_3 \ldots$, so the $j$th letter in the sequence is $t_j$. Similarly, the $m$th tile in the sequence
is $T_m$. We will also be using the notion of the length of a string of letters, meaning the number of letters in the
string. For an arbitrary string $w$ the length of $w$ will be denoted $|w|$. Use $r$ to denote the location of $t_j$ on
its tile $T_m$, so $j = (m - 1)n + r$ with $|T_m| = n$ and $r \in \{1, 2, \ldots, n\}$.

Assume for a contradiction that $\ell^\omega(t_1)$ contains an overlap; moreover that $cxcxc$ is the shortest overlap
in $\ell^\omega(t_1)$. Write $\ell^\omega(t_1) = AcxcxcB$, where $c$ is a single letter, $x$ is a finite string with $|cx| > n$, $A$ is a finite
string, and $B$ is the infinite tail of our sequence. We have that $|cx| > n$ (bound by the length of the tiles)
because each tile is a permutation of $1, 2, \ldots, n$, and we cannot have two of the three copies of $c$ contained
in one tile. Our subscripts place this overlap in our sequence. For $i \in \{1, 2, 3\}$, let $j_i$ denote the subscript of
the $i$th $c$. Thus,



(3)
$$c = t_{j_1} = t_{j_2} = t_{j_3}$$
$$x = t_{j_1+1} \cdots t_{j_2-1} = t_{j_2+1} \cdots t_{j_3-1}$$
$$A = t_1 \cdots t_{j_1-1}$$
$$B = t_{j_3+1} t_{j_3+2} t_{j_3+3} \cdots$$

Our argument proceeds as follows: there are two cases, $|cx| \not\equiv 0 \pmod{n}$ and $|cx| \equiv 0 \pmod{n}$. In the
first case we use the fact that we have a tiling of $\ell^\omega(t_1)$ by the rows of a Latin square to show that the
overlap $cxcxc$ is not possible. In the second case, when $|cx| \equiv 0 \pmod{n}$, we argue, based upon the fact
that $\ell^\omega(t_1)$ contains itself as a subsequence, that the existence of the overlap $cxcxc$ leads to the existence of
a shorter overlap, and thus a contradiction.

3.1. Case 1: $|cx| \not\equiv 0 \pmod{n}$. For each $i \in \{1, 2, 3\}$, let $r_i \in \{1, 2, \ldots, n\}$ be such that $r_i \equiv j_i \pmod{n}$. In
other words, $t_{j_i}$ is the $r_i$th letter in its tile in $\ell^\omega(t_1)$. Also, we will refer to the tile containing $t_{j_i}$ as $T_{m_i}$. It
is now possible to write the length of $cx$ as $|cx| \equiv r_2 - r_1 \equiv r_3 - r_2 \pmod{n}$. So,

(4)   $r_3 \equiv 2r_2 - r_1 \pmod{n}$.

3.1.1. Six Cases. Since $r_2 - r_1 \equiv |cx| \not\equiv 0 \pmod{n}$ there are two main cases that we will first consider:
$r_1 < r_2$ and $r_2 < r_1$. However, for the explicit details of our conclusions we will consider all six of the
following possibilities depending on the value of $r_3$,

$r_3 = 2r_2 - r_1$:        $r_1 < r_2 < r_3$   or   $r_3 < r_2 < r_1$
$r_3 = 2r_2 - r_1 - n$:    $r_1 < r_3 < r_2$   or   $r_3 < r_1 < r_2$
$r_3 = 2r_2 - r_1 + n$:    $r_2 < r_1 < r_3$   or   $r_2 < r_3 < r_1$

The equalities on the left arise out of equation (4) and the fact that the integer $2r_2 - r_1$ satisfies
$-n < 2r_2 - r_1 < 2n$. This means that $r_3$ is the element in the set $\{2r_2 - r_1 + n,\ 2r_2 - r_1,\ 2r_2 - r_1 - n\}$ that lies in
the interval $0 < r_3 \le n$. Notice that $r_3 = 2r_2 - r_1$ occurs in both main cases, when $r_1 < r_2$ and when $r_2 < r_1$.
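
The bookkeeping behind equation (4) can be made concrete. The sketch below uses hypothetical helper names (`third_position`, `classify`): it computes $r_3$ from $r_1$, $r_2$, and $n$, together with the multiple of $n$ that must be added to $2r_2 - r_1$ to land in $\{1, \ldots, n\}$.

```python
def third_position(r1, r2, n):
    """The representative r3 in {1, ..., n} of 2*r2 - r1 modulo n."""
    return (2 * r2 - r1 - 1) % n + 1


def classify(r1, r2, n):
    """Return (r3, shift) with r3 = 2*r2 - r1 + shift*n and shift in {-1, 0, 1}."""
    r3 = third_position(r1, r2, n)
    shift = (r3 - (2 * r2 - r1)) // n
    return r3, shift


# Tabulate all pairs r1 != r2 for a small n.
n = 5
for r1 in range(1, n + 1):
    for r2 in range(1, n + 1):
        if r1 != r2:
            print(r1, r2, classify(r1, r2, n))
```

When $r_1 < r_2$ the shift is $0$ or $-1$ according to whether $r_3$ lies above or below $r_2$; when $r_2 < r_1$ it is $0$ or $+1$ according to whether $r_3$ lies below or above $r_2$.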

3.1.2. $G$ and the beginning of each $cx$. When $r_1 < r_2$, we pick $G \subset \Sigma$ to be the set of the last $r_2 - r_1$ letters in $T_{m_1}$,
so that $G$ has no specific order and $G \neq \emptyset$. Of course, the remainder of the letters in $T_{m_1}$ are in $\overline{G}$, the
complement of $G$. Notice that this puts $c = t_{j_1} \in \overline{G}$. By equating the letters in $T_{m_1}$ with the corresponding
letters in $t_{j_2} x t_{j_3}$, we find that the last $n - r_2 + 1$ letters of $T_{m_2}$ (starting with $c = t_{j_2}$) are in $\overline{G}$. Also, we
find that the first $r_2 - r_1$ letters of $T_{m_2+1}$ are $G$.

When $r_2 < r_1$, we pick $G \subset \Sigma$ to be the set of the last $r_1 - r_2$ letters in $T_{m_2}$, so that $G$ has no specific order and
$G \neq \emptyset$. Obviously, the remainder of the letters in $T_{m_2}$ must be those that make up $\overline{G}$, again placing $c = t_{j_2} \in \overline{G}$.
By equating the letters in $T_{m_2}$ with the corresponding letters in $t_{j_1} x t_{j_2}$ we find that the last $n - r_1 + 1$ letters
of $T_{m_1}$ (starting with $c = t_{j_1}$) are in $\overline{G}$. Also, we find that the first $r_1 - r_2$ letters of $T_{m_1+1}$ are $G$.

We have discussed the appearance of $G$ and its complement $\overline{G}$ at the beginning of each $cx$. So, we set
forth to describe $G$ and $\overline{G}$ at the end of each $cx$.

3.1.3. Following $G$ through the overlap. It is a basic observation that because each tile is a permutation of
the letters in $\Sigma$, each tile can be partitioned into $G$ and its complement $\overline{G}$. It is fundamental to our argument
that because of the equality $t_{j_1} x t_{j_2} = cxc = t_{j_2} x t_{j_3}$, the letters in $G$ form a contiguous collection of elements
in each tile involved in our overlap excluding the tiles $T_{m_i}$ (each of which will need further description), at either the
beginning or the end of each tile. The idea involved in following $G$ through the overlap is quite simple;
we illustrate it in one particular case, $r_1 < r_2 < r_3$.

We have explicitly described the location of $G$ at the beginning of each $cx$. We will now use our example
$r_1 < r_2 < r_3$ to show the reader how the tiling of our sequence can be used to find the location of $G$ at
the end of each $cx$. In doing so, we will refer to Figure 1.

In Figure 1, we have displaced the overlap from our sequence (represented by the continuous solid horizontal
line). We have also split our overlap in half, leaving $T_{m_2}$ intact for equality purposes. We have placed
$t_{j_1} x t_{j_2}$ over $t_{j_2} x t_{j_3}$ with $t_{j_1}$ directly over $t_{j_2}$ and $t_{j_2}$ directly over $t_{j_3}$ so that we can see equality of terms
simply by looking straight up or straight down (displayed by vertical arrows). The set of letters $G$ is represented
by a horizontal solid line above and below our sequence line, and the set of letters $\overline{G}$ is represented
by horizontal dotted lines above and below the sequence line. Also, notice that we have drawn in the edges
of the tiles with smaller vertical black lines.



Figure 1: The situation when $r_1 < r_2 < r_3$.

Now notice that by using the tiles we can equate letters in $t_{j_1} x t_{j_2}$ with $t_{j_2} x t_{j_3}$ all the way through the
overlap. Since we know that $G$ occurs in the first $r_2 - r_1$ letters of $T_{m_2+1}$, then $\overline{G}$ is the last $n - (r_2 - r_1)$
letters of $T_{m_2+1}$. This causes $\overline{G}$ to be the first $n - (r_2 - r_1)$ letters of $T_{m_1+1}$, and thus $G$ appears in the
last $r_2 - r_1$ letters of $T_{m_1+1}$. Thus we can conclude that $G$ occurs in the last $r_2 - r_1$ letters of all the tiles
in $t_{j_1} x t_{j_2}$ except for $T_{m_2}$. We can also conclude that $G$ occurs in the first $r_2 - r_1$ letters of all the tiles in
$t_{j_2} x t_{j_3}$ up through $T_{m_3-1}$. We can approach every case by the same process.

3.1.4. $G$ and how each $cx$ ends. We will now explain the conclusions for the six possible cases that we defined
earlier, leaving the actual drawing to the reader.

Case $r_1 < r_2 < r_3$ (as seen in Figure 1). After we follow $G$ through the overlap, we find that $G$ occurs in
the first $r_2 - r_1$ letters of $T_{m_3}$. Recall $r_3 = 2r_2 - r_1$. So, we have that the next $r_3 - (r_2 - r_1) = r_2$ letters of
$T_{m_3}$ are not in $G$. Notice that the size of $G$, $r_2 - r_1$, added to $r_2$ makes up all of $r_3$. This places the boundary
between $T_{m_2-1}$ and $T_{m_2}$ exactly in line with the end of $G$ in $T_{m_3}$ and the beginning of $\overline{G}$. We then equate
the first letters in $T_{m_3}$ with those in $T_{m_2}$ to find that $G$ occurs nowhere in $T_{m_2}$. So now, we have described
$T_{m_2}$ fully. Earlier we defined $G$ such that $\overline{G}$ occurred from $t_{j_2}$ to the end of the tile, and we have just shown
that the first $r_2$ letters of $T_{m_2}$ (which include $t_{j_2}$) must be in $\overline{G}$. So $G$ does not appear anywhere in $T_{m_2}$,
and since $G \neq \emptyset$, we have a contradiction.

Cases $r_1 < r_3 < r_2$ and $r_3 < r_1 < r_2$. After we follow $G$ through the overlap, we find that $G$ occurs in
the first $r_2 - r_1$ letters of $T_{m_3-1}$. So, $\overline{G}$ occurs in the final $n - (r_2 - r_1)$ letters of $T_{m_3-1}$, causing the first
$n - (r_2 - r_1)$ letters of $T_{m_2}$ to be $\overline{G}$. Notice that $r_2 = [n - (r_2 - r_1)] + r_3$. So the boundary between $\overline{G}$ and
$G$ in $T_{m_2}$ coincides with the boundary between $T_{m_3-1}$ and $T_{m_3}$. This means that $t_{j_2} \in G$, but we assumed
earlier that $c \notin G$, which is a contradiction.

Case $r_3 < r_2 < r_1$. After we follow $G$ through the overlap, we find that $G$ occurs in the last $r_1 - r_2$ letters
of $T_{m_3-1}$. This causes $G$ to occur in the first $r_1 - r_2$ letters of $T_{m_2}$ by equality of $t_{j_1} x t_{j_2}$ and $t_{j_2} x t_{j_3}$. To
describe the remaining letters of $T_{m_2}$ up to and including $t_{j_2}$, consider $r_2 - (r_1 - r_2) = r_3$. So $\overline{G}$ occurs in
the next $r_3$ letters after $G$. Thus we have that $G$ is repeated twice in $T_{m_2}$, so we have our contradiction.

Cases $r_2 < r_1 < r_3$ and $r_2 < r_3 < r_1$. After we follow $G$ through the overlap, we find that $G$ occurs in
the first $r_1 - r_2$ letters of $T_{m_2-1}$. This causes $\overline{G}$ to occur in the final $n - (r_1 - r_2)$ letters of $T_{m_2-1}$ and thus
the first $n - (r_1 - r_2)$ letters of $T_{m_3}$. Since $r_2 = r_3 - [n - (r_1 - r_2)]$, we see that the left boundary of $T_{m_2}$
coincides with the right boundary of these first $n - (r_1 - r_2)$ letters of $T_{m_3}$. In particular, this means that
the last $r_1 - r_2$ letters of $T_{m_3}$, which include $c$, are in $G$. But this contradicts the fact that $c \notin G$.

3.2. Case 2: $|cx| \equiv 0 \pmod{n}$. We begin by considering some $\pi \in S_n$, the symmetric group on $n$ letters.
Note that we may apply $\pi$ to any string by requiring $\pi$ to act on each individual letter, so $\pi(t_1 t_2 \ldots t_s) =
\pi(t_1)\pi(t_2) \ldots \pi(t_s)$. Thus $\pi$ can be treated as a morphism. Moreover, $\pi : \Sigma^* \to \Sigma^*$ is an invertible map
because $\pi \in S_n$. Thus $w \in \Sigma^*$ contains an overlap if and only if $\pi(w) \in \Sigma^*$ contains an overlap.

Define the function $d_{(a,n)} : \mathbb{N} \to \mathbb{N}$ by $d_{(a,n)}(m) = (m - 1)n + a$. Now if we let $M = (t_s)$ be a sequence,
then define the sequence given by the function $D_{(a,n)}(M)$ to be the subsequence $(t_{d_{(a,n)}(s)})$ of $M$. So for
$i \in \{1, 2, \ldots, n\}$ arbitrary we have that

$$D_{(i,n)}(\ell^\omega(t_1)) = t_i t_{i+n} t_{i+2n} \cdots$$

Define $\pi_i : \Sigma \to \Sigma$ with $\pi_i \in S_n$, such that if $C_{t_1} = (t_1, t_2, \ldots, t_i, \ldots, t_n)$ then $\pi_i(t_1) = t_i$. Recall that $C_t$
refers to the $t$th row of our Latin square $C$. So we have that $\pi_i$ maps each letter in the first column of our
Latin square to the $i$th letter of its corresponding row. Now, we want to show that $\pi_i(\ell^\omega(t)) = D_{(i,n)}(\ell^\omega(t))$
for all $t \in \Sigma$. So take

$$D_{(i,n)}(\ell^\omega(t_1)) = D_{(i,n)}(\ell(\ell^\omega(t_1)))$$
$$= D_{(i,n)}(\ell(t_1)\ell(t_2)\ell(t_3)\cdots)$$
$$= \pi_i(t_1)\pi_i(t_2)\pi_i(t_3)\cdots$$
$$= \pi_i(\ell^\omega(t_1)).$$

Since $\pi_i \in S_n$ is invertible we can conclude that $D_{(i,n)}(\ell^\omega(t_1))$ contains an overlap if and only if $\ell^\omega(t_1)$
contains an overlap.
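
The identity between the decimated sequence and the permuted sequence can be checked on a finite prefix. The following is a sketch for the $n = 3$ example; the names `decimate` and `pi` are our own stand-ins for $D_{(a,n)}$ and $\pi_i$.

```python
SQUARE = [[1, 3, 2],
          [2, 1, 3],
          [3, 2, 1]]


def prefix(square, seed, iterations):
    """Prefix of the fixed point, obtained by iterating from a single letter."""
    rules = {row[0]: row for row in square}
    word = [seed]
    for _ in range(iterations):
        word = [letter for t in word for letter in rules[t]]
    return word


def decimate(seq, a, n):
    """Keep the letters in positions a, a+n, a+2n, ... (1-indexed)."""
    return seq[a - 1::n]


def pi(square, i):
    """Send the first letter of each row to the i-th letter of that row."""
    return {row[0]: row[i - 1] for row in square}


n = len(SQUARE)
word = prefix(SQUARE, 1, 4)            # 81 letters, i.e. 27 complete tiles
shorter = word[:len(word) // n]        # the 27 letters whose images form the tiles

identity_holds = all(
    decimate(word, i, n) == [pi(SQUARE, i)[t] for t in shorter]
    for i in range(1, n + 1)
)
print(identity_holds)
```

Each $\pi_i$ is a permutation because column $i$ of a Latin square contains every letter exactly once.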

Since $|cx| \equiv 0 \pmod{n}$, pick $i \equiv j_1 \equiv j_2 \equiv j_3 \pmod{n}$. By applying $D_{(i,n)}$ to the decomposition
$\ell^\omega(t_1) = AcxcxcB$ described in (3) we obtain

$$D_{(i,n)}(\ell^\omega(t_1)) = A_i\, t_{j_1}\, x_i\, t_{j_2}\, x_i\, t_{j_3}\, B_i$$

where

$$A_i = D_{(i,n)}(A) = t_i t_{i+n} t_{i+2n} \cdots,$$
$$x_i = D_{(i,n)}(x) = t_{j_1+n} t_{j_1+2n} \cdots t_{j_1+(m-1)n} = t_{j_2+n} t_{j_2+2n} \cdots t_{j_2+(m-1)n},$$
$$B_i = D_{(i,n)}(B) = t_{j_3+n} t_{j_3+2n} t_{j_3+3n} \cdots,$$

and $m = |cx|/n$. Observe that $D_{(i,n)}(\ell^\omega(t_1))$ contains a shorter overlap, which implies that $\ell^\omega(t_1)$ also
contains a shorter overlap, a contradiction of our assumption. □